December 9, 2016
p <- qplot(x, y, data = bivar); p
p + geom_smooth(method = "lm")
p + geom_smooth(method = "lm", formula = y ~ poly(x, 5))
p + geom_smooth(method = "lm", formula = y ~ poly(x, 20))
p + geom_smooth(method = "gam", formula = y ~ s(x))
p + geom_smooth(method = "loess") ## The default for fewer than 1,000 observations
The geom_smooth function adds model fits or scatter plot smoothers to a scatter plot.
The stat_smooth function is responsible for delegating the computations. See ?stat_smooth.
Spline smoothing is performed by the gam function in the mgcv package, whereas loess smoothing is performed by the loess function in the stats package.
Any "smoother" can be used that supports a formula interface and has a prediction function adhering to the standards of predict.lm.
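As a rough sketch of what this delegation amounts to (not ggplot2's actual internals), a smoother is fitted through its formula interface and then evaluated on a grid of x values with predict. The simulated data frame sim and the grid size of 80 are assumptions made for the illustration.

```r
## Simulated data, only for illustration
set.seed(1)
sim <- data.frame(x = sort(runif(200, 0, 10)))
sim$y <- sin(sim$x) + rnorm(200, sd = 0.3)

## Fit the smoother through its formula interface
fit <- loess(y ~ x, data = sim)

## Evaluate the fit on an equidistant grid, following the predict.lm conventions
grid <- data.frame(x = seq(min(sim$x), max(sim$x), length.out = 80))
grid$y <- predict(fit, newdata = grid)
head(grid)
```

Replacing loess by lm, or by gam from mgcv with the formula y ~ s(x), leaves the remaining lines unchanged, which is what makes the smoothers interchangeable.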
A running mean implementation, assuming that \(y\) is sorted in the order of the corresponding \(x\) values.
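runMean <- function(y, m) {
  n <- length(y)
  k <- 2 * m + 1                 ## Window size
  stopifnot(n >= k)
  y <- y / k
  s <- rep(NA, n)                ## The first and last m values remain NA
  s[m + 1] <- sum(y[1:k])        ## Mean over the first window
  ## Slide the window: drop the leftmost value and add the next one
  for (i in m + seq_len(n - k))
    s[i + 1] <- s[i] - y[i - m] + y[i + 1 + m]
  s
}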
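In formulas, with \(k = 2m + 1\), the function computes the following. The notation \(s_i\) for the running mean at index \(i\) is introduced here to match the variable s in the code.

\[
s_i = \frac{1}{k} \sum_{j = i - m}^{i + m} y_j, \qquad i = m + 1, \ldots, n - m,
\]

\[
s_{i+1} = s_i - \frac{y_{i-m}}{k} + \frac{y_{i+1+m}}{k}, \qquad i = m + 1, \ldots, n - m - 1.
\]

The first line is the definition, and the second is the update used in the loop, which costs two additions per index instead of \(k\).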
(See the filter function in the stats package for a much faster alternative.)
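A minimal sketch of that alternative using stats::filter. The call is namespaced because dplyr masks the name filter when loaded, and the data are simulated for illustration.

```r
set.seed(1)
y <- rnorm(50)
m <- 3
k <- 2 * m + 1

## Centered moving average with equal weights 1 / k;
## the first and last m entries are NA, as in runMean
s <- as.numeric(stats::filter(y, rep(1 / k, k), sides = 2))

## Compare with a direct computation over the first window
all.equal(s[m + 1], mean(y[1:k]))
```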
## Must be named rMean to be found via method = "rMean"
rMean <- function(..., data, m = 10) {
  ord <- order(data$x)  ## Reordering if necessary
  structure(
    list(x = data$x[ord], y = runMean(data$y[ord], m = m)),
    class = "rMean"
  )
}
## Linear interpolation of the running mean at new x values
predict.rMean <- function(object, newdata, ...)
  approx(object$x, object$y, newdata$x)$y
p + geom_smooth(method = "rMean", se = FALSE, n = 200)
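The prediction step relies on how approx treats the NA values that the running mean leaves at both ends. A small sketch with made-up numbers: pairs containing NA are dropped before interpolation, and with the default rule = 1 any point outside the remaining range gives NA, so the smoother has no value there.

```r
x <- c(1, 2, 3, 4)
s <- c(NA, 10, 20, NA)   ## Running mean values, undefined at both ends

## Linear interpolation at new x values;
## only 2.5 lies between the non-missing points 2 and 3
pred <- approx(x, s, xout = c(1.5, 2.5, 3.5))$y
pred
```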
library(readr)
library(dplyr)
traffic <- read_csv("http://nielsrhansen.github.io/Dong/BUinternet.txt")
traffic <- traffic %>%
  select(-url) %>%
  filter(size > 0) %>%
  mutate(speed = size / time) %>%
  sample_n(10000)  ## Subsampling data
p <- ggplot(traffic, aes(x = size, y = speed)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10() +
  facet_wrap(~ `machine name`)
traffic <- traffic %>%
  filter(`machine name` %in% c("animal", "beaker", "bugs", "bunsen"))
p <- ggplot(traffic, aes(x = size, y = speed)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10() +
  facet_wrap(~ `machine name`)
p + geom_smooth(se = FALSE) + geom_smooth(method = "lm", se = FALSE, color = "red")
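Since ggplot2 applies scale transformations before computing statistics, the red lines are least squares fits of log10(speed) on log10(size). A sketch of the equivalent direct computation follows. The data frame toy and its power-law relation are simulated assumptions for illustration, not the traffic data.

```r
set.seed(1)
toy <- data.frame(size = 10^runif(500, 2, 6))
## Assumed power-law relation with multiplicative noise
toy$speed <- 5 * toy$size^0.7 * 10^rnorm(500, sd = 0.2)

## A straight line on the log-log scale
fit <- lm(log10(speed) ~ log10(size), data = toy)
coef(fit)   ## The slope estimates the exponent 0.7
```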
traffic <- traffic %>% filter(`machine name` == "animal")
p <- ggplot(traffic, aes(x = size, y = speed)) +
  geom_point() +
  scale_x_log10() +
  scale_y_log10() +
  geom_smooth(method = "rMean", se = FALSE, n = 200)
This time we use our own smoother and the plotly package to produce an interactive plot.
library(plotly)
ggplotly(p)